
 cross-silo federated learning


Throughput-Optimal Topology Design for Cross-Silo Federated Learning

Neural Information Processing Systems

Federated learning usually employs a client-server architecture where an orchestrator iteratively aggregates model updates from remote clients and pushes a refined model back to them. This approach may be inefficient in cross-silo settings, as nearby data silos with high-speed access links may exchange information with each other faster than with the orchestrator, and the orchestrator may become a communication bottleneck. In this paper we define the problem of topology design for cross-silo federated learning, using the theory of max-plus linear systems to compute the system throughput, i.e., the number of communication rounds per time unit. We also propose practical algorithms that, given knowledge of measurable network characteristics, find a topology with the largest throughput or with provable throughput guarantees.
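The orchestrator-bottleneck intuition can be made concrete with a toy cycle-time comparison. The delay model and numbers below are illustrative assumptions only, not the paper's max-plus formulation; throughput is then the reciprocal of the cycle time.

```python
# Hedged sketch: round cycle time of a star (orchestrator) overlay vs. a
# ring overlay among silos, under a simplified synchronized-round delay model.

def star_cycle_time(delays_to_server):
    # One round: every client uploads, then downloads; the round is
    # bottlenecked by the slowest client-server link (used twice).
    return 2 * max(delays_to_server)

def ring_cycle_time(ring_delays):
    # In a synchronized ring, each round advances at the pace of the
    # slowest inter-silo link.
    return max(ring_delays)

# Three silos with fast links among themselves, one slow link to the server.
to_server = [10.0, 12.0, 50.0]   # ms, per-silo delay to the orchestrator
ring = [3.0, 4.0, 5.0]           # ms, delay on each ring edge

print(star_cycle_time(to_server))  # 100.0 ms per round
print(ring_cycle_time(ring))       # 5.0 ms per round
```

Under these assumed delays the peer-to-peer ring completes rounds 20x faster, which is the kind of gap a throughput-optimal topology design aims to exploit.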


On Privacy and Personalization in Cross-Silo Federated Learning

Neural Information Processing Systems

While the application of differential privacy (DP) has been well-studied in cross-device federated learning (FL), there is a lack of work considering DP and its implications for cross-silo FL, a setting characterized by a limited number of clients each containing many data subjects. In cross-silo FL, usual notions of client-level DP are less suitable as real-world privacy regulations typically concern the in-silo data subjects rather than the silos themselves. In this work, we instead consider an alternative notion of silo-specific sample-level DP, where silos set their own privacy targets for their local examples. Under this setting, we reconsider the roles of personalization in federated learning. In particular, we show that mean-regularized multi-task learning (MR-MTL), a simple personalization framework, is a strong baseline for cross-silo FL: under stronger privacy requirements, silos are incentivized to federate more with each other to mitigate DP noise, resulting in consistent improvements relative to standard baseline methods. We provide an empirical study of competing methods as well as a theoretical characterization of MR-MTL for mean estimation, highlighting the interplay between privacy and cross-silo data heterogeneity. Our work serves to establish baselines for private cross-silo FL as well as identify key directions of future work in this area.
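For the mean-estimation case the abstract mentions, the MR-MTL interpolation admits a small closed-form sketch. The regularization weight `lam` and the Gaussian noise scale below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hedged sketch of mean-regularized multi-task learning (MR-MTL) for mean
# estimation: each silo's estimate is pulled toward the cross-silo average.

def mr_mtl_estimates(noisy_local_means, lam):
    """Fixed point of: min_{w_i} (w_i - x_i)^2 + lam * (w_i - w_bar)^2.

    For mean estimation this has the closed form
    w_i = (x_i + lam * x_bar) / (1 + lam), interpolating between the purely
    local estimate (lam = 0) and the fully federated average (lam -> inf).
    """
    x = np.asarray(noisy_local_means, dtype=float)
    return (x + lam * x.mean()) / (1 + lam)

rng = np.random.default_rng(0)
true_means = np.array([1.0, 1.2, 0.9, 1.1])              # heterogeneous silos
noisy = true_means + rng.normal(scale=0.3, size=4)       # DP-style noise

print(mr_mtl_estimates(noisy, lam=0.0))   # purely local (just the noisy means)
print(mr_mtl_estimates(noisy, lam=2.0))   # shrunk toward the federated average
```

Larger `lam` averages away more DP noise at the cost of bias from cross-silo heterogeneity, which is exactly the trade-off the abstract says stronger privacy requirements tilt toward federating more.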


FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings

Neural Information Processing Systems

Federated Learning (FL) is a novel approach enabling several clients holding sensitive data to collaboratively train machine learning models, without centralizing data. The cross-silo FL setting corresponds to the case of few (2 to 50) reliable clients, each holding medium to large datasets, and is typically found in applications such as healthcare, finance, or industry. While previous works have proposed representative datasets for cross-device FL, few realistic healthcare cross-silo FL datasets exist, thereby slowing algorithmic research in this critical application. In this work, we propose a novel cross-silo dataset suite focused on healthcare, FLamby (Federated Learning AMple Benchmark of Your cross-silo strategies), to bridge the gap between theory and practice of cross-silo FL. FLamby encompasses 7 healthcare datasets with natural splits, covering multiple tasks, modalities, and data volumes, each accompanied by baseline training code. As an illustration, we additionally benchmark standard FL algorithms on all datasets. Our flexible and modular suite allows researchers to easily download datasets, reproduce results, and re-use the different components for their research.


A Coopetitive-Compatible Data Generation Framework for Cross-silo Federated Learning

Nguyen, Thanh Linh, Pham, Quoc-Viet

arXiv.org Artificial Intelligence

Cross-silo federated learning (CFL) enables organizations (e.g., hospitals or banks) to collaboratively train artificial intelligence (AI) models while preserving data privacy by keeping data local. While prior work has primarily addressed statistical heterogeneity across organizations, a critical challenge arises from economic competition, where organizations may act as market rivals, making them hesitant to participate in joint training due to potential utility loss (i.e., reduced net benefit). Furthermore, the combined effects of statistical heterogeneity and inter-organizational competition on organizational behavior and system-wide social welfare remain underexplored. In this paper, we propose CoCoGen, a coopetitive-compatible data generation framework, leveraging generative AI (GenAI) and potential game theory to model, analyze, and optimize collaborative learning under heterogeneous and competitive settings. Specifically, CoCoGen characterizes competition and statistical heterogeneity through learning performance and utility-based formulations and models each training round as a weighted potential game. We then derive GenAI-based data generation strategies that maximize social welfare. Experimental results on the Fashion-MNIST dataset reveal how varying heterogeneity and competition levels affect organizational behavior and demonstrate that CoCoGen consistently outperforms baseline methods.


Free-Rider and Conflict Aware Collaboration Formation for Cross-Silo Federated Learning

Neural Information Processing Systems

Federated learning (FL) is a machine learning paradigm that allows multiple FL participants (FL-PTs) to collaborate on training models without sharing private data. Due to data heterogeneity, negative transfer may occur in the FL training process. This necessitates FL-PT selection based on their data complementarity. In cross-silo FL, organizations that engage in business activities are key sources of FL-PTs. The resulting FL ecosystem has two features: (i) self-interest, and (ii) competition among FL-PTs.


Local Superior Soups: A Catalyst for Model Merging in Cross-Silo Federated Learning

Neural Information Processing Systems

Federated learning (FL) is a learning paradigm that enables collaborative training of models using decentralized data. Recently, the utilization of pre-trained weight initialization in FL has been demonstrated to effectively improve model performance. However, the evolving complexity of current pre-trained models, characterized by a substantial increase in parameters, markedly intensifies the challenges associated with the communication rounds required for their adaptation to FL. To address these communication cost issues and increase the performance of pre-trained model adaptation in FL, we propose an innovative model interpolation-based local training technique called "Local Superior Soups." This approach acts as a catalyst for the seamless adaptation of pre-trained models in FL. We demonstrate its effectiveness and efficiency across diverse widely-used FL datasets.
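The "soup" idea underlying such model interpolation can be sketched in a few lines: average several fine-tuned candidates of the same pre-trained model directly in weight space. This is a minimal sketch of generic uniform souping, not the exact Local Superior Soups procedure.

```python
import numpy as np

# Hedged sketch of weight-space model merging ("souping"): candidates that
# share a pre-trained initialization tend to lie in a connected low-loss
# region, so their parameters can be averaged element-wise.

def uniform_soup(candidate_weights):
    # Element-wise mean of parameter vectors from same-architecture models.
    return np.mean(np.stack(candidate_weights), axis=0)

w0 = np.array([0.0, 1.0, 2.0])   # fine-tuned candidate 1
w1 = np.array([2.0, 1.0, 0.0])   # fine-tuned candidate 2
merged = uniform_soup([w0, w1])
print(merged)  # [1. 1. 1.]
```

Merging locally before communicating is one way such a technique can reduce the number of FL communication rounds, since each round then transmits an already-consolidated model.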


Privacy Preserving and Robust Aggregation for Cross-Silo Federated Learning in Non-IID Settings

Arazzi, Marco, Cihangiroglu, Mert, Nocera, Antonino

arXiv.org Artificial Intelligence

Federated Averaging remains the most widely used aggregation strategy in federated learning due to its simplicity and scalability. However, its performance degrades significantly in non-IID data settings, where client distributions are highly imbalanced or skewed. Additionally, it relies on clients transmitting metadata, specifically the number of training samples, which introduces privacy risks and may conflict with regulatory frameworks like the European GDPR. In this paper, we propose a novel aggregation strategy that addresses these challenges by introducing class-aware gradient masking. Unlike traditional approaches, our method relies solely on gradient updates, eliminating the need for any additional client metadata, thereby enhancing privacy protection. Furthermore, our approach validates and dynamically weights client contributions based on class-specific importance, ensuring robustness against non-IID distributions, convergence prevention, and backdoor attacks. Extensive experiments on benchmark datasets demonstrate that our method not only outperforms FedAvg and other widely accepted aggregation strategies in non-IID settings but also preserves model integrity in adversarial scenarios. Our results establish the effectiveness of gradient masking as a practical and secure solution for federated learning.
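The metadata concern can be illustrated by contrasting FedAvg's sample-count weighting with an aggregation rule that uses only the updates themselves. The inverse-norm weighting below is an illustrative stand-in, not the paper's class-aware gradient masking rule, which is not reproduced here.

```python
import numpy as np

# Hedged sketch: FedAvg needs each client's sample count (metadata),
# whereas a metadata-free rule derives weights from the updates alone.

def fedavg(updates, n_samples):
    # Classic FedAvg: weights proportional to disclosed sample counts.
    w = np.asarray(n_samples, dtype=float)
    w /= w.sum()
    return np.average(np.stack(updates), axis=0, weights=w)

def metadata_free_agg(updates):
    # No sample counts: weight each update by its clipped inverse norm so
    # a single outsized update cannot dominate the aggregate.
    U = np.stack(updates)
    norms = np.linalg.norm(U, axis=1)
    w = 1.0 / np.maximum(norms, 1e-8)
    w /= w.sum()
    return np.average(U, axis=0, weights=w)

updates = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
print(fedavg(updates, n_samples=[5, 5]))   # [0.5 0.5]
print(metadata_free_agg(updates))          # [0.5 0.5]
```

With equal sample counts and equal-norm updates the two rules coincide; they diverge precisely when counts are skewed or an update is anomalously large, which is where the metadata-free rule's robustness matters.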


Review for NeurIPS paper: Throughput-Optimal Topology Design for Cross-Silo Federated Learning

Neural Information Processing Systems

In general the paper reads well; I have only minor comments regarding clarity of presentation. I appreciate the authors being upfront about which kinds of scenarios this work applies to and which it does not. Sec 1 grounds the rest of the work very well in prior research. In particular, in contrast to some related recent works, I feel this thorough grounding helps the authors ask better questions, and in this sense the work could help inspire further research. L65: I am not sure how compression plays into the preference for synchronous algorithms. It is relevant, though, including for the precise topic studied here. A perhaps-missing reference is Caldas et al., "Expanding the Reach of Federated Learning by Reducing Client Resource Requirements", which does both model and update compression as discussed in the text.


Review for NeurIPS paper: Throughput-Optimal Topology Design for Cross-Silo Federated Learning

Neural Information Processing Systems

The paper proposes methods for designing the communication graph for decentralized periodic averaging SGD (DPASGD) in the federated learning setup, focusing on reducing the per-iteration complexity (cycle time). The reviews were very appreciative of the paper's good system and experimental design, which accounts for various types of delays in realistic scenarios. I would like to thank the authors for their effort. The reviewers were quite engaged and provided much useful feedback, and I hope it will be used to improve the paper. In particular, I would like to comment on a few points -- please see the full reviews for details:
- Although the authors motivate the need for focusing on cycle time over convergence rate in the introduction, based on the reviews I believe it would be useful to include this discussion explicitly as a highlighted paragraph or subsection (see also comments by R2 on the digraph constraint).
- I would also encourage you to consider the title change suggested by R2 (or something similar), as I and the other reviewers agree that the current title is too generic.

